knitr::opts_chunk$set(
  echo = FALSE,
  message = FALSE,
  warning = FALSE,
  error = FALSE, 
  collapse = TRUE,
  comment = "",
  fig.height = 8,
  fig.width = 12,
  fig.align = "center",
  cache = FALSE
)

Instructions

Exercise

I love a sunburnt country

A land of sweeping plains

Of ragged mountain ranges

Of droughts and flooding rains

(From the poem My Country by Dorothea Mackellar)

Over the last week there have been many news stories about an ongoing dry period in large parts of Australia. Reports say that the west of NSW, large parts of Queensland, and even the Gippsland region of Victoria have had insufficient rain for many months, and that this is severely affecting many farmers. This assignment is designed to examine what is happening using publicly available data.

Data collection

This is what I have done to get the data to this point. The Global Historical Climatology Network, maintained by NOAA, curates weather records for stations across the globe.

  • You can get the list of stations and their latitude and longitude from the ghcnd-stations.txt file on the raw files site https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/. Australian stations have “ASN” prefixing the station id. There are 17219 recording stations across Australia. A few of them are located far from the mainland, so they are presumably in Antarctica, on remote islands, and possibly on naval ships.

  • The data is stored in a single file for each station. The Australian Bureau of Meteorology links to this site for Australian data, and you can download station by station from their site, but this is a very inefficient interface. We can read multiple files by scripting it in R. (You could pay BOM to download your preferred summaries.) I pulled weather data from stations around Victoria as a first pass, and checked the precipitation in the most recent records. It quickly became clear that a lot of stations have not had their data updated recently in this database. So I needed a way to get just the data from stations that had measurements for this year, all the way through to August. The web page listing the files available for all stations has a “modified date” for each file, so I extracted this information and used it to select only stations in Australia whose data was modified this month.

  • The next step is to go file by file, pull the data for these stations, and combine it into a single data file. This is the same as what we did for the Melbourne weather station during a past class.
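You do not need this for the assignment, but the station-selection step above can be sketched roughly as follows. The column positions are my reading of the fixed-width layout of ghcnd-stations.txt (id in columns 1-11, latitude in 13-20, longitude in 22-30, name in 42-71) and should be checked against the current GHCN-Daily readme. The station ids, coordinates, names, and the bounding box used to flag far-away stations are all invented for illustration.

```r
# Build toy lines laid out like ghcnd-stations.txt (all values are invented).
make_line <- function(id, lat, lon, elev, name) {
  sprintf("%-11s %8.4f %9.4f %6.1f %-2s %-30s", id, lat, lon, elev, "", name)
}
station_lines <- c(
  make_line("ASN00099901", -37.81, 144.97, 31, "TOY STATION A"),
  make_line("ASN00099902", -68.57,  77.97, 18, "TOY STATION B"),
  make_line("USW00099903",  40.78, -73.97, 40, "TOY STATION C")
)

# Slice the fixed-width fields out of each line.
stations <- data.frame(
  id   = substr(station_lines, 1, 11),
  lat  = as.numeric(substr(station_lines, 13, 20)),
  lon  = as.numeric(substr(station_lines, 22, 30)),
  name = trimws(substr(station_lines, 42, 71)),
  stringsAsFactors = FALSE
)

# Australian stations have "ASN" prefixing the station id.
aus <- stations[startsWith(stations$id, "ASN"), ]

# Flag stations outside a rough, assumed box around the mainland.
far_from_mainland <- aus[aus$lat < -45 | aus$lat > -9 |
                         aus$lon < 110 | aus$lon > 155, ]
```

With real data, `station_lines` would come from `readLines()` on the downloaded stations file instead of the toy lines.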

You don’t need to repeat what I have done. Start by reading in the data that has already been created.

Your tasks

  1. Use your web surfing skills and find an article on the CURRENT drought in Australia. Report the link to the article and write a few sentences summarising its main points.

Various possibilities here

  1. Read in the raw data and put it into tidy form. The code is provided, with a few spots where you need to fill in the functions. Wherever you see ??? in the code, replace it with the function needed to make the code work.

    1. What value is used in the raw data to indicate a missing value? -9999
    2. Why are the precipitation values divided by something? They were reported in tenths of mm, with the decimal point dropped.
  2. Compute the monthly precipitation for each year. (You need to sum up the precipitation for each month, for each year.) When working with precipitation it is important to summarise using totals. This is different from working with temperature, where we would typically summarise using means. Why should you use totals to summarise precipitation?

  3. Make a line plot of monthly precipitation by month, grouping by station, for 2018. Overlay a smoother. Is there generally a decreasing trend in rainfall this year across the country?

The first few months saw a lot of rain at a few locations. There was less rain in recent months. It is not clear whether this is the usual pattern across these stations for the first 7 months of the year.
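The tidying and monthly-total steps from the tasks above can be sketched on a toy example. This is not the provided assignment code: the wide layout (one `VALUE` column per day), the column names, and the station id are assumptions made for illustration, and only three days are shown.

```r
library(dplyr)
library(tidyr)

# Toy raw data in wide form: one row per station-month, one column per day.
# Values are in tenths of mm; -9999 marks a missing day. All values invented.
raw <- tibble(
  station = c("ASN00099901", "ASN00099901"),
  year    = c(2018, 2018),
  month   = c(1, 2),
  VALUE1  = c(12, 0),
  VALUE2  = c(-9999, 35),
  VALUE3  = c(8, -9999)
)

# Tidy form: one row per station-day, missing code -> NA, tenths of mm -> mm.
tidy <- raw %>%
  pivot_longer(starts_with("VALUE"), names_to = "day", values_to = "prcp",
               names_prefix = "VALUE",
               names_transform = list(day = as.integer)) %>%
  mutate(prcp = na_if(prcp, -9999) / 10)

# Monthly totals for each station and year.
monthly <- tidy %>%
  group_by(station, year, month) %>%
  summarise(prcp = sum(prcp, na.rm = TRUE), .groups = "drop")
```

Note that `na.rm = TRUE` silently treats missing days as dry days, so months with many missing records will have understated totals. Counting the missing days per month alongside the total is one way to keep track of this.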

  1. Compute the long term average monthly precipitation for each station. (Use your previous summary and then average the values by month.) This is going to be a baseline for comparing precipitation this year.
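A minimal sketch of this baseline, assuming a `monthly` summary with columns `station`, `year`, `month` and `prcp` (these names and all values are invented for illustration):

```r
library(dplyr)

# Toy monthly totals (mm) for one station over three years.
monthly <- tibble(
  station = "ASN00099901",
  year    = rep(c(2015, 2016, 2017), each = 2),
  month   = rep(c(1, 2), times = 3),
  prcp    = c(40, 10, 60, 30, 50, 20)
)

# Long term average of the monthly totals, per station and calendar month.
longterm <- monthly %>%
  group_by(station, month) %>%
  summarise(avg = mean(prcp), .groups = "drop")
```

Here the January baseline is the mean of 40, 60 and 50 mm, and the February baseline is the mean of 10, 30 and 20 mm. Whether 2018 itself should be included in the baseline is a choice the task leaves open.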

  1. Now we want to look at the general pattern of rainfall this year, and whether it is higher or lower than expected. Compute the relative change between the monthly precipitation for the first 7 months of this year, 2018, and the long term monthly average precipitation, for each station. Plot this difference on a map of Australia. (Make sure you use an appropriate colour scale.) To calculate relative change, use this formula:

\[\text{Relative change} = \frac{T_{2018}-T_{\text{longterm}}}{T_{\text{longterm}}}\]

(While you have this map, make it interactive with plotly, so that the station id pops up on mouseover. Find a station in the west of NSW or southwest of Qld, or even in Gippsland. Keep the id for this station for the next question.)
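The relative change formula can be sketched on toy data as follows. The table and column names (`this_year`, `longterm`, `prcp`, `avg`) and all values are assumptions for illustration, not the assignment's own objects.

```r
library(dplyr)

# Toy 2018 monthly totals and long term monthly averages (mm).
this_year <- tibble(station = "ASN00099901", month = c(1, 2), prcp = c(25, 30))
longterm  <- tibble(station = "ASN00099901", month = c(1, 2), avg  = c(50, 20))

# Relative change = (T_2018 - T_longterm) / T_longterm, month by month.
rel <- this_year %>%
  inner_join(longterm, by = c("station", "month")) %>%
  mutate(rel_change = (prcp - avg) / avg)
```

In this toy case January is 50% below the baseline (-0.5) and February is 50% above it (0.5). Because the values are centred on zero, a diverging colour scale with its midpoint at zero, such as ggplot2's `scale_colour_gradient2(midpoint = 0)`, is one reasonable choice for the map. A long term average of zero makes the ratio undefined, so such months need separate handling.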

  1. This is another way to look at how this year compares with previous years. For each month, compute the percentile of this year’s rainfall. That is, count the number of years since 1950 in which this month had lower precipitation than it did this year. Find a location (or a few) where this year’s rainfall has been consistently low for each month (Jan-Jul), relative to other years.
# A tibble: 651 x 5
   station     month nless     n   pct
   <chr>       <dbl> <int> <int> <dbl>
 1 ASN00001019     3     0    19     0
 2 ASN00001019     6     0    19     0
 3 ASN00001019     7     0    19     0
 4 ASN00002012     1     0    67     0
 5 ASN00002012     2     0    67     0
 6 ASN00002012     3     0    67     0
 7 ASN00002012     5     0    67     0
 8 ASN00002012     6     0    67     0
 9 ASN00002012     7     0    67     0
10 ASN00003003     5     0    67     0
11 ASN00003003     6     0    67     0
12 ASN00004032     5     0    67     0
13 ASN00004032     7     0    67     0
14 ASN00005007     3     0    59     0
15 ASN00005007     4     0    62     0
16 ASN00006011     1     0    67     0
17 ASN00006011     3     0    67     0
18 ASN00006011     4     0    67     0
19 ASN00007176     7     0    42     0
20 ASN00008050     2     0    46     0
21 ASN00008051     3     0    67     0
22 ASN00008051     4     0    67     0
23 ASN00009021     3     0    67     0
24 ASN00009193     3     0    29     0
25 ASN00009193     5     0    28     0
26 ASN00009518     3     0    67     0
27 ASN00009789     4     0    48     0
28 ASN00009789     5     0    48     0
29 ASN00009789     6     0    48     0
30 ASN00010286     3     0    21     0
31 ASN00010917     3     0    19     0
32 ASN00011003     3     0    66     0
33 ASN00013017     3     0    61     0
34 ASN00013017     5     0    61     0
35 ASN00013017     7     0    61     0
36 ASN00014040     5     0    43     0
37 ASN00014932     3     0    39     0
38 ASN00014932     5     0    37     0
39 ASN00014932     6     0    37     0
40 ASN00014932     7     0    38     0
41 ASN00015135     3     0    48     0
42 ASN00015135     5     0    48     0
43 ASN00015135     6     0    48     0
44 ASN00015135     7     0    49     0
45 ASN00015548     4     0    44     0
46 ASN00015548     5     0    44     0
47 ASN00015548     6     0    44     0
48 ASN00015548     7     0    44     0
49 ASN00015590     3     0    67     0
50 ASN00015590     5     0    67     0
# ... with 601 more rows
# A tibble: 22 x 2
   station         s
   <chr>       <dbl>
 1 ASN00002012 0.254
 2 ASN00006011 1.72 
 3 ASN00009789 1.29 
 4 ASN00014932 1.69 
 5 ASN00015135 1.48 
 6 ASN00015548 1.58 
 7 ASN00015590 1.58 
 8 ASN00017043 1.19 
 9 ASN00023031 1.42 
10 ASN00024580 1.98 
# ... with 12 more rows
# A tibble: 7 x 5
  station     month nless     n    pct
  <chr>       <dbl> <int> <int>  <dbl>
1 ASN00044021     1     0    67 0     
2 ASN00044021     2    28    67 0.418 
3 ASN00044021     3     6    67 0.0896
4 ASN00044021     4    47    67 0.701 
5 ASN00044021     5     8    67 0.119 
6 ASN00044021     6    32    67 0.478 
7 ASN00044021     7    14    67 0.209 
# A tibble: 7 x 5
  station     month nless     n    pct
  <chr>       <dbl> <int> <int>  <dbl>
1 ASN00043109     1     1    20 0.05  
2 ASN00043109     2    16    20 0.8   
3 ASN00043109     3     0    20 0     
4 ASN00043109     4     0    20 0     
5 ASN00043109     5     0    21 0     
6 ASN00043109     6     7    21 0.333 
7 ASN00043109     7     1    21 0.0476
# A tibble: 7 x 5
  station     month nless     n    pct
  <chr>       <dbl> <int> <int>  <dbl>
1 ASN00048027     1    13    55 0.236 
2 ASN00048027     2     1    55 0.0182
3 ASN00048027     3     0    55 0     
4 ASN00048027     4    14    55 0.255 
5 ASN00048027     5     5    56 0.0893
6 ASN00048027     6    21    56 0.375 
7 ASN00048027     7     1    56 0.0179
# A tibble: 7 x 5
  station     month nless     n    pct
  <chr>       <dbl> <int> <int>  <dbl>
1 ASN00085298     1    32    47 0.681 
2 ASN00085298     2    13    47 0.277 
3 ASN00085298     3    13    48 0.271 
4 ASN00085298     4     3    47 0.0638
5 ASN00085298     5    17    48 0.354 
6 ASN00085298     6    15    48 0.312 
7 ASN00085298     7    30    49 0.612 
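Tables with the columns `nless`, `n` and `pct` like those above could be produced along the following lines. This is a sketch on invented data, not the code behind the printed output. In particular, defining `n` as the number of earlier years with a record is my assumption, and the sketch assumes each station-month has exactly one 2018 row.

```r
library(dplyr)

# Toy January totals (mm) for one station; all values invented.
monthly <- tibble(
  station = "ASN00099901",
  year    = c(2014, 2015, 2016, 2017, 2018),
  month   = 1,
  prcp    = c(30, 5, 45, 60, 20)
)

pct <- monthly %>%
  filter(year >= 1950) %>%
  group_by(station, month) %>%
  summarise(
    # earlier years that were drier than 2018 in this month
    nless = sum(prcp[year < 2018] < prcp[year == 2018]),
    # earlier years with a record for this month (assumed definition)
    n     = sum(year < 2018),
    pct   = nless / n,
    .groups = "drop"
  )
```

Only 2015 was drier than 2018 in this toy series, so `nless` is 1 out of `n` = 4 earlier years. A `pct` near zero means that this month in 2018 was among the driest on record for that station.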

  1. Find another historical long period of well below normal rainfall for the first half of the year, for this station (or another). Do a web search to see if there were any news stories about that drought period.
# A tibble: 68 x 2
    year     s
   <int> <dbl>
 1  1965  67.5
 2  2018  96.8
 3  1972 108. 
 4  1993 115. 
 5  1992 118. 
 6  1985 121  
 7  2017 140. 
 8  1964 141. 
 9  1970 169. 
10  2013 172. 
# ... with 58 more rows

The 1965 drought in eastern Australia is described on Wikipedia.
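A ranking of years like the one above can be sketched by summing the early-year monthly totals for one station and sorting from driest to wettest. The station id, the values, and the choice of January to June as "the first half of the year" are assumptions for illustration. The printed table may have used a different window.

```r
library(dplyr)

# Toy monthly totals (mm) for one station; month 8 is there to show
# that months outside the window are excluded. All values invented.
monthly <- tibble(
  station = "ASN00099901",
  year    = rep(c(2016, 2017, 2018), each = 3),
  month   = rep(c(1, 2, 8), times = 3),
  prcp    = c(50, 40, 99, 20, 10, 99, 30, 25, 99)
)

driest <- monthly %>%
  filter(station == "ASN00099901", month <= 6) %>%
  group_by(year) %>%
  summarise(s = sum(prcp)) %>%
  arrange(s)
```

The years at the top of `driest` are the candidates to search for in news archives.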

Grading

Points for the assignment will be based on: